<html>
<head>
<!--

    Copyright (c) 2010, 2021 Oracle and/or its affiliates. All rights reserved.

    This program and the accompanying materials are made available under the
    terms of the Eclipse Distribution License v. 1.0, which is available at
    http://www.eclipse.org/org/documents/edl-v10.php.

    SPDX-License-Identifier: BSD-3-Clause

-->

	<title>Design of XSOM</title>
	<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
	<style>
		pre {
			background-color: rgb(240,240,240);
			margin-left:	2em;
			margin-right: 2em;
			padding: 1em;
		}
		p {
			margin-left: 2em;
		}
		dt {
			margin-top: 0.5em;
			margin-left: 2em;
			font-weight: bold;
		}
		dd {
			margin-left: 3em;
		}
	</style>
</head>
<body>

<h1 style="text-align:center">Design of XSOM</h1>
<div align=right style="font-size:smaller">
By <a href="mailto:kohsuke.kawaguchi@sun.com">Kohsuke Kawaguchi</a><br>
</div>

<p>
	This document describes the details you need to know to extend/maintain XSOM.
</p>

<h1>Design Goals</h1>
<p>
	The primary design goals of XSOM are:
</p>
<ol>
	<li>Expose all the information defined in the schema spec
	<li>Provide additional methods that helps simplifying client applications.
</ol>
<p>
	Providing mutation methods was a non-goal for this project, primarily because of the added complexity.
</p>


<h1>Building workspace</h1>
<p>
	The workspace uses Ant as the build tool. The followings are the major targets:
</p>
<dl>
	<dt>clean</dt>
	<dd>remove the intermediate and output files.</dd>
	
	<dt>compile</dt>
	<dd>generate a parser by RelaxNGCC and compile all the source files into the bin directory.</dd>
	
	<dt>jar</dt>
	<dd>make a jar file</dd>
	
	<dt>release</dt>
	<dd>build a distribution zip file that contains everything from the source code to a binary file</dd>
	
	<dt>src-zip</dt>
	<dd>Build a zip file that contains the source code.</dd>
</dl>

<h1>Architecture</h1>
<p>
	XSOM consists of roughly three parts.
	
	The first part is the public interface, which is defined in the <code>com.sun.xml.xsom</code> package. The entire functionality of XSOM is exposed via this interface. This interface is derived from a draft document submitted to W3C by some WG members.
</p><p>
	The second part is the implementation of these interfaces, the <code>com.sun.xml.xsom.impl</code> package. These code are all hand-written.
</p><p>
	The third part is a parser that reads XML representation of XML Schema and builds XSOM nodes accordingly. The package is <code>com.sun.xml.xsom.parser</code>. This part of the code is mostly generated by <a href="http://relaxngcc.sourceforge.net/">RelaxNGCC</a>.
</p>
<center>
	<img src="architecture.png"/>
</center>




<h1>Implementation Details</h1>
<p>
	Most of the implementation classes are fairly simple. Probably the only one interesting piece of code is the <code>Ref</code> class, which is a reference to other schema components.
</p><p>
	The <code>Ref</code> class itself is just a place hodler and this class defined a series of inner interfaces that are specialized to hold a reference to different kinds of schema components.
	
	The sole purpose of this indirection layer is to support forward references during a parsing of the XML representation.
</p><p>
	A typical reference interface would look like this:
</p>
<pre>
public static interface Term {
    /** Obtains a reference as a term. */
    XSTerm getTerm();
}
</pre>
<p>
	In case this indirection is unnecessary, all implementation classes of <code>XSTerm</code> implements this <code>Ref.Term</code> interface. This applies to all the other types of the <code>Ref</code> interface. Therefore, whereever a reference is necessary, you can stimply pass a real object. In other words, a direct reference (<code>XS***Impl</code>) can be always treated as an indirect reference (<code>Ref.***</code>).
</p><p>
	Implementations for forward references are placed in the <code>com.sun.xml.xsom.impl.parser.DelayedRef</code> class. The detail will be discussed later.
</p>



<h1>Parser</h1>
<p>
	The following collaboration diagram shows various objects that participate in a parsing process.
</p>
<center>
	<img src="collaboration.png"/>
</center>
<p>
	<code>XSOMParser</code> is the only publicly visible component in this picture. This class also keeps references to vairous other objects that are necessary to parse schemas. This includes an error handler, the root <code>SchemaSet</code> object, an entity resolver, etc.
</p><p>
	Whenever the parse method is called, it will create a new NGCCRuntimeEx and configure XMLReader so that a schema file is parsed into this NGCCRuntimeEx instance.
	
	<code>NGCCRuntimeEx</code> derives from <code>NGCCRuntime</code>, which is a class generated by RelaxNGCC. This object will use other RelaxNGCC-generated classes and parse a document and constructs a XSOM object graph appropriately.
</p><p>
	When a new XML document is referenced by an import or include statement, a new set of <code>NGCCRuntimeEx</code> is set up to parse that document. One NGCCRuntimeEx can only parse one XML document.
</p>
<h2>Forward references and back-patching</h2>
<p>
	Since we use SAX to parse schemas, the referenced schema component is often unavailable when we hit a reference. Because of this, when we see a reference, we create a "delayed" reference that keeps the name of the referenced component.
</p><p>
	Note that because of the way XML Schema &lt;redefine> works, all the references by name must be lazily bound even if the component is already defined.
</p><p>
	All these "delayed" references are remembered and tracked by XSOMParser. When the client calls the <code>XSOMParser.getResult</code> method, XSOMParser will make sure that they resolve to a schema component correctly.
	"Delayed" references are available in the <code>DelayedRef</code> class.
</p>


<h2>RelaxNGCC</h2>
<p>
	The actual parser is generated by RelaxNGCC from <code>xsom/src/*.rng</code> files. <code>xmlschema.rng</code> is the entry point and all the other files are referenced from this file. For more information about RelaxNGCC, goto <a href="http://relaxngcc.sourceforge.net/">here</a>. Or just contact me (as I'm one of the developers of RelaxNGCC.)
</p>

</body>
</html>
